智能论文笔记

Fairness Based Energy-Efficient 3D Path Planning of a Portable Access Point: A Deep Reinforcement Learning Approach

Nithin Babu , Igor Donevski , Alvaro Valcarce , Petar Popovski , Jimmy Jessen Nielsen , Constantinos B. Papadias

分类：人工智能 | 机器学习 | 机器人

2022-08-10

在这项工作中，我们优化了基于无人机（UAV）的便携式接入点（PAP）的3D轨迹，该轨迹为一组接地节点（GNS）提供无线服务。此外，根据Peukert效果，我们考虑无人机电池的实用非线性电池放电。因此，我们以一种新颖的方式提出问题，代表了基于公平的能源效率度量的最大化，并被称为公平能源效率（费用）。费用指标定义了一个系统，该系统对每用户服务的公平性和PAP的能源效率都非常重要。该法式问题采用非凸面问题的形式，并具有不可扣除的约束。为了获得解决方案，我们将问题表示为具有连续状态和动作空间的马尔可夫决策过程（MDP）。考虑到解决方案空间的复杂性，我们使用双胞胎延迟的深层确定性政策梯度（TD3）参与者 - 批判性深入强化学习（DRL）框架来学习最大化系统费用的政策。我们进行两种类型的RL培训来展示我们方法的有效性：第一种（离线）方法在整个训练阶段保持GN的位置相同；第二种方法将学习的政策概括为GN的任何安排，通过更改GN的位置，每次培训情节后。数值评估表明，忽视Peukert效应高估了PAP的播放时间，可以通过最佳选择PAP的飞行速度来解决。此外，用户公平，能源效率，因此可以通过有效地将PAP移动到GN上方，从而提高系统的费用价值。因此，我们注意到郊区，城市和茂密的城市环境的基线情景高达88.31％，272.34％和318.13％。

translated by 谷歌翻译

State and parameter learning with PaRIS particle Gibbs

Gabriel Cardoso , Yazid Janati El Idrissi , Sylvain Le Corff , Eric Moulines , Jimmy Olsson

分类： (统计)机器学习

2023-01-02

Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as likelihood estimation (MLE) and Markov score climbing (MSC). PARIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs PPG sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao--Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.

translated by 谷歌翻译

Robust Cross-vendor Mammographic Texture Models Using Augmentation-based Domain Adaptation for Long-term Breast Cancer Risk

Andreas D. Lauritzen , My Catarina von Euler-Chelpin , Elsebeth Lynge , Ilse Vejborg , Mads Nielsen , Nico Karssemeijer , Martin Lillholm

分类：计算机视觉

2022-12-27

The future of population-based breast cancer screening is likely personalized strategies based on clinically relevant risk models. Mammography-based risk models should remain robust to domain shifts caused by different populations and mammographic devices. Modern risk models do not ensure adaptation across vendor-domains and are often conflated to unintentionally rely on both precursors of cancer and systemic/global mammographic information associated with short- and long-term risk, respectively, which might limit performance. We developed a robust, cross-vendor model for long-term risk assessment. An augmentation-based domain adaption technique, based on flavorization of mammographic views, ensured generalization to an unseen vendor-domain. We trained on samples without diagnosed/potential malignant findings to learn systemic/global breast tissue features, called mammographic texture, indicative of future breast cancer. However, training so may cause erratic convergence. By excluding noise-inducing samples and designing a case-control dataset, a robust ensemble texture model was trained. This model was validated in two independent datasets. In 66,607 Danish women with flavorized Siemens views, the AUC was 0.71 and 0.65 for prediction of interval cancers within two years (ICs) and from two years after screening (LTCs), respectively. In a combination with established risk factors, the model's AUC increased to 0.68 for LTCs. In 25,706 Dutch women with Hologic-processed views, the AUCs were not different from the AUCs in Danish women with flavorized views. The results suggested that the model robustly estimated long-term risk while adapting to an unseen processed vendor-domain. The model identified 8.1% of Danish women accounting for 20.9% of ICs and 14.2% of LTCs.

translated by 谷歌翻译

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

Jimmy Z. Di , Jack Douglas , Jayadev Acharya , Gautam Kamath , Ayush Sekhari

分类：机器学习 | 人工智能

2022-12-21

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced. An adversary first adds a few carefully crafted points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10, Imagenette, and Imagewoof. This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.

translated by 谷歌翻译

Precise Zero-Shot Dense Retrieval without Relevance Labels

Luyu Gao , Xueguang Ma , Jimmy Lin , Jamie Callan

分类：自然语言处理

2022-12-20

While dense retrieval has been shown effective and efficient across tasks and languages, it remains difficult to create effective fully zero-shot dense retrieval systems when no relevance label is available. In this paper, we recognize the difficulty of zero-shot learning and encoding relevance. Instead, we propose to pivot through Hypothetical Document Embeddings~(HyDE). Given a query, HyDE first zero-shot instructs an instruction-following language model (e.g. InstructGPT) to generate a hypothetical document. The document captures relevance patterns but is unreal and may contain false details. Then, an unsupervised contrastively learned encoder~(e.g. Contriever) encodes the document into an embedding vector. This vector identifies a neighborhood in the corpus embedding space, where similar real documents are retrieved based on vector similarity. This second step ground the generated document to the actual corpus, with the encoder's dense bottleneck filtering out the incorrect details. Our experiments show that HyDE significantly outperforms the state-of-the-art unsupervised dense retriever Contriever and shows strong performance comparable to fine-tuned retrievers, across various tasks (e.g. web search, QA, fact verification) and languages~(e.g. sw, ko, ja).

translated by 谷歌翻译

Less is More: Parameter-Free Text Classification with Gzip

Zhiying Jiang , Matthew Y. R. Yang , Mikhail Tsirlin , Raphael Tang , Jimmy Lin

分类：自然语言处理

2022-12-19

Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive with billions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that's easy, light-weight and universal in text classification: a combination of a simple compressor like gzip with a $k$-nearest-neighbor classifier. Without any training, pre-training or fine-tuning, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distributed datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also performs particularly well in few-shot settings where labeled data are too scarce for DNNs to achieve a satisfying accuracy.

translated by 谷歌翻译

Surrogate-assisted level-based learning evolutionary search for heat extraction optimization of enhanced geothermal system

Guodong Chen , Xin Luo , Chuanyin Jiang , Jiu Jimmy Jiao

分类：神经与进化计算

2022-12-15

An enhanced geothermal system is essential to provide sustainable and long-term geothermal energy supplies and reduce carbon emissions. Optimal well-control scheme for effective heat extraction and improved heat sweep efficiency plays a significant role in geothermal development. However, the optimization performance of most existing optimization algorithms deteriorates as dimension increases. To solve this issue, a novel surrogate-assisted level-based learning evolutionary search algorithm (SLLES) is proposed for heat extraction optimization of enhanced geothermal system. SLLES consists of classifier-assisted level-based learning pre-screen part and local evolutionary search part. The cooperation of the two parts has realized the balance between the exploration and exploitation during the optimization process. After iteratively sampling from the design space, the robustness and effectiveness of the algorithm are proven to be improved significantly. To the best of our knowledge, the proposed algorithm holds state-of-the-art simulation-involved optimization framework. Comparative experiments have been conducted on benchmark functions, a two-dimensional fractured reservoir and a three-dimensional enhanced geothermal system. The proposed algorithm outperforms other five state-of-the-art surrogate-assisted algorithms on all selected benchmark functions. The results on the two heat extraction cases also demonstrate that SLLES can achieve superior optimization performance compared with traditional evolutionary algorithm and other surrogate-assisted algorithms. This work lays a solid basis for efficient geothermal extraction of enhanced geothermal system and sheds light on the model management strategies of data-driven optimization in the areas of energy exploitation.

translated by 谷歌翻译

Improving Precancerous Case Characterization via Transformer-based Ensemble Learning

Yizhen Zhong , Jiajie Xiao , Thomas Vetterli , Mahan Matin , Ellen Loo , Jimmy Lin , Richard Bourgon , Ofer Shapira

分类：机器学习

2022-12-10

The application of natural language processing (NLP) to cancer pathology reports has been focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform the CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer and precancerous cases. We achieved 0.914 macro-F1 scores for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.

translated by 谷歌翻译

Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

Juhan Bae , Michael R. Zhang , Michael Ruan , Eric Wang , So Hasegawa , Jimmy Ba , Roger Grosse

分类：机器学习 | 人工智能 | (统计)机器学习

2022-12-07

Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent variable should retain. This trade-off between the reconstruction error (distortion) and the KL divergence (rate) is typically parameterized by a hyperparameter $\beta$. In this paper, we introduce Multi-Rate VAE (MR-VAE), a computationally efficient framework for learning optimal parameters corresponding to various $\beta$ in a single training run. The key idea is to explicitly formulate a response function that maps $\beta$ to the optimal parameters using hypernetworks. MR-VAEs construct a compact response hypernetwork where the pre-activations are conditionally gated based on $\beta$. We justify the proposed architecture by analyzing linear VAEs and showing that it can represent response functions exactly for linear VAEs. With the learned hypernetwork, MR-VAEs can construct the rate-distortion curve without additional training and can be deployed with significantly less hyperparameter tuning. Empirically, our approach is competitive and often exceeds the performance of multiple $\beta$-VAEs training with minimal computation and memory overheads.

translated by 谷歌翻译

Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST)

Jimmy Hickey , Jonathan P. Williams , Emily C. Hector

分类： (统计)机器学习

2022-11-29

Transfer learning uses a data model, trained to make predictions or inferences on data from one population, to make reliable predictions or inferences on data from another population. Most existing transfer learning approaches are based on fine-tuning pre-trained neural network models, and fail to provide crucial uncertainty quantification. We develop a statistical framework for model predictions based on transfer learning, called RECaST. The primary mechanism is a Cauchy random effect that recalibrates a source model to a target population; we mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models, in the sense that prediction sets will achieve their nominal stated coverage, and we numerically illustrate the method's robustness to asymptotic approximations for nonlinear models. Whereas many existing techniques are built on particular source models, RECaST is agnostic to the choice of source model. For example, our RECaST transfer learning approach can be applied to a continuous or discrete data model with linear or logistic regression, deep neural network architectures, etc. Furthermore, RECaST provides uncertainty quantification for predictions, which is mostly absent in the literature. We examine our method's performance in a simulation study and in an application to real hospital data.

translated by 谷歌翻译